12 research outputs found

SMART-KG: Hybrid Shipping for SPARQL Querying on the Web

Author: Acosta Maribel
Aluç Güneş
Aranda Carlos Buil
Bonatti Piero Andrea
Buil-Aranda Carlos
Erling Orri
Hartig Olaf
Hasnain Ali
Heling Lars
Hernández-Illera A.
Martínez-Prieto M.A.
Meimaris M.
Polleres Axel
Saleem Muhammad
Publication venue: ACM Digital Library
Publication date: 01/01/2020

While Linked Data (LD) provides standards for publishing (RDF) and (SPARQL) querying Knowledge Graphs (KGs) on the Web, serving, accessing and processing such open, decentralized KGs is often practically impossible, as query timeouts on publicly available SPARQL endpoints show. Alternative solutions such as Triple Pattern Fragments (TPF) attempt to tackle the problem of availability by pushing query processing workload to the client side, but suffer from unnecessary transfer of irrelevant data on complex queries with large intermediate results. In this paper we present smart-KG, a novel approach to share the load between servers and clients, while significantly reducing data transfer volume, by combining TPF with shipping compressed KG partitions. Our evaluations show that smart-KG outperforms state-of-the-art client-side solutions and increases server-side availability towards more cost-effective and balanced hosting of open and decentralized KGs

Crossref

KITopen

LETEO: Scalable anonymization of big data and its application to learning analytics

Author: Buil Aranda Carlos
Etcheverry Lorena
Giménez Eduardo
Olmedo Federico
Pastorini Marcos
Toro Matías
Publication venue: Udelar. FI.
Publication date: 01/01/2021

ANII Fondo sectorial de investigación con datos - 2018. Created in 2007, Plan Ceibal is an inclusion and equal-opportunities plan that aims to support Uruguayan educational policies with technology. Over these years, in the course of its work, Ceibal has accumulated a large amount of data on the use of technology in education, which it needs to manage the plan and fulfill its legal mandate. However, this data cannot be studied without first addressing the problem of de-identifying the users of the Plan. To exploit this data, Ceibal has deployed an instance of the Hortonworks Data Platform (HDP), an open source platform for the storage and parallel processing of massive data (big data). HDP offers a wide range of functional components, from large file storage (HDFS) to distributed programming of machine learning algorithms (Apache Spark / MLlib). However, as of today there are no open source solutions for the de-identification of personal data that are integrated into the Hortonworks ecosystem. On the one hand, existing de-identification tools were not designed to scale easily to large volumes of data, nor do they offer easy integration mechanisms with HDFS. This forces users to export the data from the platform that stores it in order to anonymize it, with the consequent risk of exposing confidential information. On the other hand, the few solutions integrated into the Hortonworks ecosystem are proprietary, and the cost of their licenses is very significant. The objective of this project is to promote the use of the enormous amount of educational and technological data that Ceibal holds by removing one of the greatest obstacles to doing so, namely the preservation of privacy and the protection of the personal data of the Plan's beneficiaries. To this end, the project seeks to build anonymization tools that extend the HDP platform. In particular, it seeks to develop open source modules to integrate into that platform, which implement a set of anonymization techniques and algorithms programmed in a distributed manner using Apache Spark and which can be applied to data sets stored in HDFS files.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

ADMIRE D1.5 – Report defining an iteration of the model and language: PM3 and DL3

Author: Aranda Carlos Buil
Atkinson Malcolm
Brezany Peter
Janciak Ivan
van Hemert Jano
Woehrer Alexander
Publication venue
Publication date: 01/09/2009

Edinburgh Research Explorer

Development of a Semantic Web Solution for Directory Services

Author: Buil Aranda Carlos
Publication venue: Institutt for datateknikk og informasjonsvitenskap
Publication date: 01/01/2005

The motivation for this work is a common problem in organizations: accessing and managing the growing amount of data stored in companies. Companies can use emerging Semantic Web technology to solve this problem. Invenio AS needs to access a directory service efficiently, and Semantic Web languages can be used to do so. This thesis presents a literature study of the main ontology languages proposed by the World Wide Web Consortium, RDF(S) and OWL with its extension for Web services, OWL-S, and of the ontology language proposed by the International Organization for Standardization, Topic Maps. The literature study can serve as an introduction to these Web ontology languages: RDF, OWL (and OWL-S) and Topic Maps. A model of the databases has been extracted and designed in UML. The extracted model has been used to create a common ontology that merges both initial databases. The ontology representing the database in the three languages has been analysed, as have the quality and semantic accuracy of the languages for the Invenio case, and detailed results of this analysis are reported.

NORA - Norwegian Open Research Archives

Development of a Semantic Web Solution for Directory Services

Author: Buil Aranda Carlos
Publication venue: Institutt for datateknikk og informasjonsvitenskap
Publication date: 01/01/2005

Croatian Digital Thesis Repository

University of Zagreb Repository

Situated Support for Choice of Representation

Author: Anders Kofod-Petersen
Carlos Buil Aranda
Sari E Hakkarainen
Publication venue
Publication date: 24/04/2020

CiteSeerX

PromoterLCNN: A Light CNN-Based Promoter Prediction and Classification Model

Author: Carlos Buil-Aranda
Daryl Hernández
Mauricio Araya
Nicolás Jara
Roberto E. Durán
Publication venue: MDPI AG
Publication date: 01/06/2022

Promoter identification is a fundamental step in understanding bacterial gene regulation mechanisms. However, accurate and fast classification of bacterial promoters continues to be challenging. New methods based on deep convolutional networks have been applied to identify and classify bacterial promoters recognized by sigma (σ) factors and RNA polymerase subunits which increase affinity to specific DNA sequences to modulate transcription and respond to nutritional or environmental changes. This work presents a new multiclass promoter prediction model by using convolutional neural networks (CNNs), denoted as PromoterLCNN, which classifies Escherichia coli promoters into subclasses σ70, σ24, σ32, σ38, σ28, and σ54. We present a light, fast, and simple two-stage multiclass CNN architecture for promoter identification and classification. Training and testing were performed on a benchmark dataset, part of RegulonDB. Comparative performance of PromoterLCNN against other CNN-based classifiers using four parameters (Acc, Sn, Sp, MCC) resulted in similar or better performance than those that commonly use cascade architecture, reducing time by approximately 30–90% for training, prediction, and hyperparameter optimization without compromising classification quality

Directory of Open Access Journals

PubMed Central

SPARQLES: monitoring public SPARQL endpoints

Author: Buil Aranda Carlos
Hogan Aidan
Matteis Luca
Umbrich Jürgen
Vandenbussche Pierre Yves
Publication venue: IOS Press
Publication date: 01/01/2017

We describe SPARQLES: an online system that monitors the health of public SPARQL endpoints on the Web by probing them with custom-designed queries at regular intervals. We present the architecture of SPARQLES and the variety of analytics that it runs over public SPARQL endpoints, categorised by availability, discoverability, performance and interoperability. We also detail the interfaces that the system provides for human and software agents to learn more about the recent history and current state of an individual SPARQL endpoint or about overall trends concerning the maturity of all endpoints monitored by the system. We likewise present some details of the performance of the system and the impact it has had thus far. Fujitsu Laboratories Limited; CONICYT/FONDECYT Project 3130617; FONDECYT Project 11140900; DGIP Project 116.24.1; Millennium Nucleus Center for Semantic Web Research NC12000

Repositorio Académico de la Universidad de Chile

GenoVi, an open-source automated circular genome visualizer for bacteria and archaea.

Author: Andrea Rodríguez-Delherbe
Andrés Cumsille
Beatriz Cámara
Carlos Buil-Aranda
Mauricio Araya
Michael Seeger
Nicolás Jara
Roberto E Durán
Vicente Saona-Urmeneta
Publication venue: Public Library of Science (PLoS)
Publication date: 01/04/2023

The increase in microbial sequenced genomes from pure cultures and metagenomic samples reflects the current attainability of whole-genome and shotgun sequencing methods. However, software for genome visualization still lacks automation, integration of different analyses, and customizable options for non-experienced users. In this study, we introduce GenoVi, a Python command-line tool able to create custom circular genome representations for the analysis and visualization of microbial genomes and sequence elements. It is designed to work with complete or draft genomes, featuring customizable options including 25 different built-in color palettes (including 5 color-blind safe palettes), text formatting options, and automatic scaling for complete genomes or sequence elements with more than one replicon/sequence. Using a GenBank format file as the input file or multiple files within a directory, GenoVi (i) visualizes genomic features from the GenBank annotation file, (ii) integrates a Cluster of Orthologs Group (COG) categories analysis using DeepNOG, (iii) automatically scales the visualization of each replicon of complete genomes or multiple sequence elements, and (iv) generates COG histograms, COG frequency heatmaps and output tables including general stats of each replicon or contig processed. GenoVi's potential was assessed by analyzing single and multiple genomes of Bacteria and Archaea. Paraburkholderia genomes were analyzed to obtain a fast classification of replicons in large multipartite genomes. GenoVi works as an easy-to-use command-line tool and provides customizable options to automatically generate genomic maps for scientific publications, educational resources, and outreach activities. GenoVi is freely available and can be downloaded from https://github.com/robotoD/GenoVi

Directory of Open Access Journals

D1.1.3: NeOn Formalisms for Modularization: Syntax, Semantics, Algebra

Author: Buil Aranda Carlos
Caracciolo Caterina
d'Aquin Mathieu
Dzbor Martin
Euzenat Jérôme
Haase Peter
Iglesias Marta
Jacques Yves
Jose Manuel Gomez
Rudolph Sebastian
Zimmermann Antoine
Publication venue: HAL CCSD
Publication date: 01/01/2008

The goal of this document is to come up with a formalism for ontology modularization, including syntaxes and the fundamental properties of a semantics of such a formalism. Furthermore, we introduce operators to create, combine and manipulate ontology modules and give formal definitions for these operators based on the semantics of ontology modules. The definition of the NeOn formalism for modularization and of the operators to manipulate ontology modules are guided by a number of use cases and examples, from NeOn case studies and other work packages.

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server